12 research outputs found

    Optimal Categorical Attribute Transformation for Granularity Change in Relational Databases for Binary Decision Problems in Educational Data Mining

    This paper presents an approach for transforming data granularity in hierarchical databases for binary decision problems by applying regression to categorical attributes at the lower grain levels. Attributes from a lower-hierarchy entity in the relational database have their information content optimized through regression on the category histogram, trained on a small, exclusive labelled sample, instead of using the usual mode category of the distribution. The paper validates the approach on a binary decision task for assessing the quality of secondary schools, focusing on how logistic regression transforms student and teacher attributes into school attributes. Experiments were carried out on public datasets of Brazilian schools via 10-fold cross-validation, comparing the ranking scores, which were also produced by logistic regression. The proposed approach achieved higher performance than the usual distribution mode transformation and performance equal to the expert weighting approach, as measured by the maximum Kolmogorov-Smirnov distance and the area under the ROC curve at the 0.01 significance level. Comment: 5 pages, 2 figures, 2 tables
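
The contrast between the mode transformation and the histogram-based one can be illustrated with a minimal sketch. It only shows the two aggregation steps from the lower-grain entity (students) to the higher-grain one (a school); the regression that the abstract fits on the histogram is not reproduced here. The attribute name, category levels, and function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def mode_category(values):
    """Return the most frequent category (first seen wins on ties)."""
    return Counter(values).most_common(1)[0][0]


def category_histogram(values, categories):
    """Return the share of each category, in the order given by `categories`."""
    counts = Counter(values)
    total = len(values)
    return [counts[c] / total for c in categories]


# Hypothetical student-level attribute for a single school.
parent_education = ["primary", "secondary", "primary", "higher"]
levels = ["primary", "secondary", "higher"]

# Mode transformation: one category stands in for the whole school.
school_mode = mode_category(parent_education)

# Histogram transformation: one numeric feature per category.
school_hist = category_histogram(parent_education, levels)
```

The histogram keeps the full distribution of the attribute, so a downstream model such as logistic regression can weight each category separately, whereas the mode discards everything except the most common value.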

    Storage Capacity of RAM-based Neural Networks: Pyramids

    Recently the authors developed a modular approach to assess the storage capacity of RAM-based neural networks [1] that can be applied to any architecture. It is based on collisions of information during the learning process and has already been applied to the GNU architecture. In this paper, the technique is applied to the pyramid. The results explain practical problems reported in the literature and agree with experimental data and with theoretical results obtained by architecture-specific techniques developed by other research groups. The successful application of this modular approach to the pyramid and GNU architectures, two of the three most common ones, shows it to be a useful tool for specifying RAM-based neural networks for practical applications. Keywords: RAM-based neural networks, storage capacity, information collision. 1 Introduction Hardware implementability is one of the main features of RAM-based neural networks. They use Random Access Memory (RAM) chips as neur..

    Analysis of the Storage Capacity of RAM-based Neural Networks

    This paper presents a probabilistic approach based on collisions to assess the storage capacity of RAM-based neural networks. The analysis at the neuron level provides the basis for evaluating storage capacity in the architectures. The approach is tested on the GNU and pyramid networks. For the GNU used as an auto-associative memory, the theoretical results fit well with Braga's experimental data and are more broadly applicable than Braga's and Wong & Sherrington's theoretical results. For the pyramid, the theoretical results fit well with Penny & Stonham's experimental data. We discuss the approximations and limitations of the approach. An important aspect of this approach is that the storage capacity of any network can be assessed for the specific data it will handle, under any probability distribution. This is a tool to be used for "learning" the connections of RAM-based networks. Keywords: RAM-based neural networks, storage capacity, elementary probability.
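
As a rough illustration of the kind of elementary probability involved, the sketch below computes the chance that a new pattern addresses a RAM location that earlier patterns have already written. It is not the derivation from the paper: it assumes addresses are drawn independently and uniformly over the 2^n locations of an n-input node, and it ignores whether the stored values actually conflict. The function name is hypothetical.

```python
def collision_probability(n_inputs, stored_patterns):
    """Probability that a new random address hits an already written location.

    Assumes each stored pattern addresses one of the 2**n_inputs locations
    independently and uniformly at random (an illustrative assumption).
    """
    locations = 2 ** n_inputs
    # Chance that none of the stored patterns used the new pattern's address.
    p_free = (1 - 1 / locations) ** stored_patterns
    return 1 - p_free
```

Under these assumptions the probability grows with the number of stored patterns and shrinks as the node gets more inputs, which is the qualitative trade-off a storage capacity analysis has to quantify.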

    Recurrent neural networks with pRAMs

    This paper introduces a pRAM (probabilistic RAM) node system for sequential pattern verification. It includes a recurrent network trained with reinforcement learning, using the current state training strategy to generate the target for the reward/penalty signal. The main issues concerning the architecture's applicability to sequential pattern verification are discussed, and experiments illustrate its behaviour. Sequence similarity measures are defined in order to take advantage of the network properties. Finally, the results of preliminary experiments in signature verification are presented and future developments are discussed. Keywords: pRAM node, reinforcement learning, recurrent network, attractor, Hamming distance, general neural unit. 1 Introduction The probabilistic RAM node was proposed in 1988 [12] and has reinforcement learning [13] as its main learning algorithm. An important point is that this neuron model, along with its learning algorithm, is hardware implementable [9], a..

    A new evolutionary method for time series forecasting

    This paper presents a new method for time series prediction, the Time-delay Added Evolutionary Forecasting (TAEF) method, which performs an evolutionary search for the minimum number of dimensions embedded in the problem needed to determine the characteristic phase space of the time series. The proposed method is inspired by F. Takens' theorem and consists of an intelligent hybrid model composed of an artificial neural network (ANN) combined with a modified genetic algorithm (GA). Initially, the TAEF method finds the best-fitting predictor model for representing the series and then performs a behavioral statistical test in order to adjust time phase distortions
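
The phase-space reconstruction that Takens' theorem motivates can be sketched as a time-delay embedding. The example below only builds the delay vectors for a given dimension and lag; the evolutionary search that TAEF uses to choose those values, the neural network predictor, and the phase adjustment test are not shown. The function name and parameters are hypothetical.

```python
def delay_embed(series, dim, lag=1):
    """Build time-delay vectors [x[t], x[t-lag], ..., x[t-(dim-1)*lag]]."""
    start = (dim - 1) * lag  # first index with a full history available
    return [
        [series[t - k * lag] for k in range(dim)]
        for t in range(start, len(series))
    ]


# Toy series, embedded in three dimensions with a lag of one step.
vectors = delay_embed([1, 2, 3, 4, 5], dim=3, lag=1)
```

Each vector can serve as the input to a predictor of the next value in the series, so choosing `dim` and `lag` amounts to choosing which past observations the model sees.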

    New frontiers in applied data mining

    Five high-quality workshops were held at the 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2009) in Bangkok, Thailand, during April 27-30, 2009. There were 17, 6, 9, 4 and 5 accepted papers to be presented, respectively, at the Pacific Asia Workshop on Intelligence and Security Informatics (PAISI 2009), the workshop on Advances and Issues in Biomedical Data Mining (AIBDM 2009), the workshop on Data Mining with Imbalanced Classes and Error Cost (ICEC 2009), the workshop on Open Source in Data Mining (OSDM 2009), and the workshop on Quality Issues, Measures of Interestingness and Evaluation of Data Mining Models (QIMIE 2009). One competition, the PAKDD 2009 Data Mining Competition, and one local workshop, the Thai Track Session, were also arranged. From these workshops (except PAISI, which published its works in separate LNCS proceedings), we selected the two or three best papers for this LNCS publication. PAKDD is a major international conference in the areas of data mining (DM) and knowledge discovery in databases (KDD). It provides an international forum for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all KDD-related areas including data mining, data warehousing, machine learning, databases, statistics, knowledge acquisition and automatic scientific discovery, data visualization, causal induction and knowledge-based systems